Abstract. In the present study we assessed the utility of H3 gene sequences for phylogenetic reconstruction of the
Heterobranchia (Mollusca, Gastropoda). To this end, histone H3 data were collected for 49 species including most of
the major groups. The sequence alignment provided a total of 246 sites, of which 105 were variable and 96
parsimony-informative. Twenty-four (of 82) first base positions were variable, as were 78 of the third base positions,
but only 3 of the second base positions.


H3 analyses showed a high codon usage bias. The consistency index was low (0.210) and substitution saturation was
observed in the third codon position. The alignment of the amino acid sequences translated from the H3 DNA sequences
had no sites that were parsimony-informative within the Heterobranchia.


Phylogenetic trees were reconstructed using maximum parsimony, maximum likelihood and Bayesian methodologies. 
Nodilittorina unifasciata was used as outgroup. 


The resolution of the deeper nodes was limited in this molecular study. The data themselves were not sufficient to clar- 
ify phylogenetic relationships within Heterobranchia. Neither the monophyly of the Euthyneura nor a step-by-step evo- 
lution by the “basal” groups was supported. A conclusion about the monophyly of Opisthobranchia and Pulmonata could 
not be extracted from our data because we did not have any resolution at this point. 


We believe histone H3 alone provides no new marker for studying deep molecular evolution of the Heterobranchia due 
to the high grade of conservation and the low phylogenetic signal. 


Surprisingly, there was good resolution at the genus level. Analyses conducted with maximum parsimony and Bayesian
inference (using all data) recovered all (or nearly all) genera, mostly with statistically significant support.
Further studies focusing on the possible utility of histone H3 for the resolution of recent splits will be necessary.
1. INTRODUCTION


Many questions regarding gastropod phylogeny have not yet been answered, such as the molecular confirmation
of the Heterobranchia concept based on morphological studies by HASZPRUNAR (1985, 1988). This taxon contains
the Pentaganglionata HASZPRUNAR, 1985, also known as Euthyneura SPENGEL, 1881 (with the Opisthobranchia and
Pulmonata), and several mostly little-known “basal” groups (e.g. Valvatoidea, Omalogyroidea, Architectonicoidea,
Rissoelloidea and Pyramidelloidea) which present a step-by-step evolution towards the euthyneuran level of
organisation (HASZPRUNAR 1988). The hyperstrophy of the protoconch is the most important autapomorphic character
of the Heterobranchia. The Euthyneura are characterised by the presence of two additional (so-called parietal)
ganglia. However, the monophyly of the Euthyneura has not yet been clarified by molecular studies. In some studies
they are recovered as monophyletic (COLGAN et al. 2000, 2003; KNUDSEN et al. 2006), in others not (THOLLESSON
1999). The Pulmonata and Opisthobranchia can be separated by characters of the nervous system (presence of a
procerebrum and cerebral bodies in pulmonates and presence of a rhinophoral nerve in Opisthobranchia and
Pyramidelloidea). However, the molecular confirmation of the monophyly of the Opisthobranchia (VONNEMANN et al.
2005; GRANDE et al. 2004a) and the Pulmonata (TILLIER et al. 1996; DAYRAT et al. 2001) is still a matter of
debate. There is no comprehensive investigation concerning the “basal” groups. Only a few representative taxa
(e.g. Valvatoidea: Cornirostra pellucida; Architectonicoidea: Philippea lutea; Pyramidelloidea: Pyramidella
dolabrata) have been included in current molecular studies (COLGAN et al. 2000; GRANDE et al. 2004a, 2004b).
In recent years molecular systematic analyses in gastropods have utilised a variety of genetic markers, e.g.
nuclear 28S ribosomal RNA and/or 18S ribosomal RNA or mitochondrial 16S ribosomal RNA and/or cytochrome oxidase
subunit I (TILLIER et al. 1994, 1996; DAYRAT et al. 2001; VONNEMANN et al. 2005; THOLLESSON et al. 1999;
REMIGIO & HEBERT 2003). Nevertheless, new genetic markers are needed for the resolution of certain phylogenetic
relationships (especially regarding deeper nodes). Partial fragments of the gene coding for the extremely
conserved H3 protein (MAXSON et al. 1983) were first used to clarify arthropod molecular evolution (COLGAN et al.
1998) and later polychaete (BROWN et al. 1999), gastropod (COLGAN et al. 2000, 2003), polyplacophoran (OKUSU
et al. 2003), cephalopod (LINDGREN et al. 2004) and hexapod (KJER et al. 2006) phylogeny. All of these studies
used a combined dataset in their approaches. In their study of gastropod phylogeny, COLGAN et al. (2000) did not
find a monophyletic Heterobranchia, while within the Euthyneura the Opisthobranchia were paraphyletic with
respect to the pulmonates. Very similar phylogenetic relationships were shown in COLGAN et al. (2003). The
Heterobranchia as well as the Opisthobranchia and Pulmonata are rarely recovered as monophyletic in these studies.


In the present study we wanted to test the utility of H3 gene sequences for phylogenetic reconstruction within
the Heterobranchia (focusing primarily on the Opisthobranchia). We were especially interested in testing whether
H3 is suitable for resolving deeper nodes within heterobranch phylogeny. Therefore, partial histone H3 data were
collected for 49 species including most of the major groups (Euthyneura with Opisthobranchia and Pulmonata, and
“basal” groups with Valvatoidea, Architectonicoidea, Omalogyroidea, Rissoelloidea and Pyramidelloidea).


2. MATERIALS AND METHODS 


2.1. Specimens and DNA extraction 


The studied taxa and the accession numbers are listed in 
Table 1. Twenty of the 49 sequences are taken from Gen- 
Bank. Opisthobranchia are represented by 26 species (in- 
cluding 11 suborders). Nodilittorina unifasciata 
(Caenogastropoda Cox, 1960) was used as an outgroup. 


DNA was extracted from ethanol-preserved individuals us- 
ing the DNeasy Tissue Kit from Qiagen (Hilden, Ger- 
many). 


Table 1. Taxonomic positions and collecting localities of the sampled taxa, with accession numbers of the sequences
included in the analyses (ZSM = Zoologische Staatssammlung); published sequences taken from GenBank are marked with
an asterisk.

    Family            Species                                      Locality                          GenBank accession number

Caenogastropoda
  Littorinoidea
    Littorinidae      Nodilittorina unifasciata (Gray, 1826)       GenBank                           AF033705*
    Conidae           Conus miles Linnaeus, 1758                   GenBank                           AF033684*
    Campanilidae      Campanile symbolicum Iredale, 1917           GenBank                           AF033683*

Opisthobranchia
  Nudibranchia
    Tethydidae        Tethys fimbria Linné, 1767                   Blanes, Spain                     EF133468
    Discodorididae    Discodoris atromaculata (Bergh, 1880)        GenBank                           DQ280013*
    Arminidae         Armina neapolitana (Delle Chiaje, 1824)      Banyuls-sur-Mer, France           EF133469
  Pleurobranchoidea
    Pleurobranchidae  Pleurobranchaea meckeli Leue, 1813           Blanes, Spain                     EF133470
  Tylodinoidea
    Umbraculidae      Umbraculum umbraculum (Lightfoot, 1786)      Atlantic Ocean, Meteor Bank       EF133471
  Cephalaspidea
    Scaphandridae     Scaphander lignarius (Linné, 1758)           Blanes, Spain                     EF133472
    Philinidae        Philine aperta (Linnaeus, 1767)              GenBank                           DQ093508*
    Gastropteridae    Gastropteron meckeli Kosse, 1813             Blanes, Spain                     EF133473
  Anaspidea
    Akeridae          Akera bullata Müller, 1776                   Kattegat, Denmark                 EF133474
    Aplysiidae        Aplysia californica Cooper, 1863             Miami, USA                        EF133475
    Aplysiidae        Aplysia cf. juliana Quoy & Gaimard, 1832     GenBank                           AF033675*
    Aplysiidae        Bursatella leachii de Blainville, 1817       Dingo Beach, Australia            EF133476
  Thecosomata
    Cavoliniidae      Clio pyramidata Linné, 1767                  Canary Islands, Spain             EF133477
    Creseidae         Creseis sp.                                  GenBank                           DQ280012*
  Gymnosomata
    Pneumodermatidae  Pneumoderma cf. atlantica (Oken, 1815)       USA, Atlantic                     EF133478
Table 1 (continued).

    Family            Species                                      Locality                          GenBank accession number

  Sacoglossa
    Placobranchidae   Elysia timida (Risso, 1818)                  Blanes, Spain                     EF532179
    Placobranchidae   Elysia pusilla (Bergh, 1872)                 GenBank                           DQ534792*
    Placobranchidae   Elysia crispata Mörch, 1863                  GenBank                           DQ534790*
    Placobranchidae   Elysia viridis (Montagu, 1804)               GenBank                           DQ534790*
    Cylindrobullidae  Cylindrobulla beauii Fischer, 1857           Florida, USA                      EF133480
  Acochlidia
    Hedylopsidae      Hedylopsis spiculifera (Kowalewsky, 1901)    Rovinj, Croatia                   EF133481
    Microhedylidae    Unela glandulifera (Kowalewsky, 1901)        Rovinj, Croatia                   EF133482
  Architectibranchia
    Hydatinidae       Micromelo undatus (Bruguiere, 1792)          GenBank                           DQ093513*
  Acteonoidea
    Acteonidae        Pupa solidula (Linné, 1758)                  Dingo Beach, Australia            EF133483
    Acteonidae        Rictaxis punctocaelatus (Carpenter, 1864)    Cayucos, California, USA          EF133484
    Bullinidae        Bullina lineata (Gray, 1825)                 GenBank                           AF033680*

Pulmonata
  Systellommatophora
    Onchidiidae       Onchidium sp.                                GenBank                           AF933706*
    Onchidiidae       Onchidella floridana (Dall, 1885)            Florida, USA                      EF133485
    Onchidiidae       Onchidella sp.                               GenBank                           DQ093511*
  Stylommatophora
    Charopidae        Hedleyoconcha delta (Pfeiffer, 1857)         GenBank                           AF033693*
    Siphonariidae     Siphonaria serrata (Fischer, 1807)           South Africa                      EF133486
    Siphonariidae     Siphonaria concinna Sowerby, 1824            South Africa                      EF133487
    Siphonariidae     Siphonaria zelandica (Quoy & Gaimard, 1832)  GenBank                           AF033713*
    Amphibolidae      Salinator solida (Schacko, 1878)             GenBank                           AF033712*
  Eupulmonata
    Ellobiidae        Ophicardelus ornatus (Ferussac, 1821)        GenBank                           AF033707*

“basal” Heterobranchia/Triganglionata
  Pyramidelloidea
    Pyramidellidae    Turbonilla lactea (Linné, 1758)              Roscoff, France                   EF133488
    Pyramidellidae    Turbonilla sp.                               Wellington, New Zealand           EF133489
  Architectonicoidea
    Architectonicidae Heliacus variegatus (Gmelin, 1791)           Tropical aquarium (ZSM 20012193)  EF133490
    Architectonicidae Philippea lutea (Lamarck, 1822)              GenBank                           AF033708*
  Valvatoidea
    Cornirostridae    Cornirostra pellucida (Laseron, 1954)        GenBank                           AF033685*
    Orbitestellidae   Orbitestella vera Powell, 1940               Wellington, New Zealand           EF561623
    Orbitestellidae   Orbitestella sp.                             Leigh, New Zealand                EF561624
  Omalogyroidea
    Omalogyridae      Omalogyra burdwoodiana Strebel, 1908         Antarctic (ZSM Mol-20021228)      EF133491
  Rissoelloidea
    Rissoelidae       Rissoella elongatospira Ponder, 1966         Wellington, New Zealand           EF561622
    Rissoelidae       Rissoella micra Finlay, 1924                 Wellington, New Zealand           EF561620
    Rissoelidae       Rissoella cystophora Finlay, 1924            Wellington, New Zealand           EF561621


2.2. DNA amplification and sequencing


The following degenerate primers were used: H3-F: 5'-ATG GCT CGT ACC AAG CAG AC(ACG) GC-3' and
H3-R: 5'-ATA TCC TT(AG) GGC AT(AG) AT(AG) GTG AC-3' (COLGAN et al. 1998); they produced a 246 bp product.
The PCR profile was as follows: 95 °C for 5 min, followed by 35 cycles of 30 s at 95 °C, 25 s at 52 °C and
45 s at 72 °C, and a final extension at 72 °C for 5 min. Recombinant Taq polymerase from Invitrogen (Karlsruhe,
Germany) was used. All products were purified using the QIAquick Gel Extraction Kit from Qiagen (Hilden,
Germany) and sequenced in both directions on a Beckman Coulter CEQ 2000 using the CEQ DTCS Quick Start Kit
(Krefeld, Germany). Initially, only one fragment per specimen was sequenced. To rule out errors that might go
unnoticed because of the conserved nature of the H3 gene, a random sample of 5 species was sequenced a second
time; no differences were detected.
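As an illustration of what "degenerate" means for the primers given in section 2.2, the following minimal Python sketch expands each primer into the individual oligonucleotides it stands for. The function name `expand_degenerate` and the parsing of the parenthesised notation are assumptions made for this example only; they are not part of the original protocol.

```python
import itertools
import re


def expand_degenerate(primer: str) -> list[str]:
    """Expand a primer written with parenthesised alternatives, e.g. 'AC(ACG)GC'.

    Hypothetical helper: each '(XYZ)' group is read as 'one of X, Y or Z'.
    """
    seq = primer.replace(" ", "")
    # Each match is either a parenthesised group of alternatives or a fixed stretch.
    parts = re.findall(r"\(([ACGT]+)\)|([ACGT]+)", seq)
    choices = [list(group) if group else [fixed] for group, fixed in parts]
    return ["".join(combo) for combo in itertools.product(*choices)]


# Primer sequences as printed in the text (5' to 3').
H3_F = "ATG GCT CGT ACC AAG CAG AC(ACG) GC"
H3_R = "ATA TCC TT(AG) GGC AT(AG) AT(AG) GTG AC"

forward = expand_degenerate(H3_F)
reverse = expand_degenerate(H3_R)
print(len(forward), "forward variants;", len(reverse), "reverse variants")
```

Read this way, H3-F stands for three oligonucleotides (one degenerate position with three alternatives) and H3-R for eight (three positions with two alternatives each).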


2.3. Sequence alignment 


Sequences were aligned manually using the software pack- 
age BioEdit version 7.0.5 (HALL 1999). The H3 DNA se- 
quences were translated into the amino acid sequences in 
GeneDoc version 2.6.002 (NICHOLAS & NICHOLAS 1997). 
The alignment is available from the authors upon request. 
2.4. Statistical tests


Codon usage statistics were calculated using GCUA version 1.2 (MCINERNEY 1997). This program calculates the
number (N) of times a particular codon is observed in an alignment as well as the Relative Synonymous Codon
Usage (RSCU) values for the dataset. RSCU values give the number of times a particular codon is observed
relative to the number of times that the codon would be observed in the absence of any codon usage bias.
Without any codon usage bias, the RSCU value would be 1.00. A codon that is used less frequently than expected
will have a value of less than 1.00 and a codon that is used more frequently than expected will have a value
of more than 1.00 (MCINERNEY 1997).
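The RSCU definition above can be written out in a few lines. The sketch below is not GCUA itself; it is a minimal reimplementation of the definition, assuming that "expected in the absence of bias" means the amino acid's total count divided equally among its synonymous codons. The phenylalanine counts are those listed in Table 3.

```python
def rscu(counts: dict[str, int]) -> dict[str, float]:
    """RSCU for the synonymous codons of one amino acid.

    RSCU = observed count / count expected if all synonymous codons
    were used equally often (assumption stated in the lead-in).
    """
    total = sum(counts.values())
    if total == 0:
        # Amino acid absent from the alignment: no usage to compare.
        return {codon: 0.0 for codon in counts}
    expected = total / len(counts)
    return {codon: n / expected for codon, n in counts.items()}


# Phenylalanine counts from Table 3: UUU observed 6 times, UUC 142 times.
phe = rscu({"UUU": 6, "UUC": 142})
print({codon: round(value, 2) for codon, value in phe.items()})
# prints {'UUU': 0.08, 'UUC': 1.92}
```

The two values sum to the number of synonymous codons (here 2), which is a convenient sanity check on any RSCU table.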


The degree of bias (Σχ²/n), which is the sum of the χ² values for the individual amino acids divided by the
total number of inferred residues (n) for the combined data from all species (SHIELDS et al. 1988), was
determined.


The substitution saturation was calculated for all 3 codon 
positions using the method developed by XIA et al. (2003) 
implemented in the software package DAMBE version 
4.2.13 (XIA & XIE 2001). 


2.5. Phylogenetic reconstruction 


Appropriate models for the analyses were selected by running Modeltest version 3.4 (POSADA & CRANDALL
1998) and applying the Akaike information criterion (AIC) (see Tab. 2).


The following analyses were conducted using PAUP* version 4.0 b10 (SWOFFORD 2002) (settings: heuristic search
strategy; TBR; gaps were treated as fifth bases): a) maximum parsimony for all data and b) maximum likelihood
for all data.


Table 2. Information on the models used.

Codon position              Model      Gamma distribution shape parameter
1st position                GTR+I      α = equal
2nd position                TVMef+I    α = equal
3rd position                TVM+G      α = 0.8835
1st and 2nd position        GTR+I+G    α = 0.0167
1st, 2nd and 3rd position   GTR+I+G    α = 1.0265


Bootstrapping (FELSENSTEIN 1985) was performed for maximum parsimony with 1000 replicates and for maximum
likelihood with 100 replicates.
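For readers unfamiliar with the bootstrap in phylogenetics, the following sketch shows the resampling step only: alignment columns are drawn with replacement to build a pseudo-replicate of the same length, and a tree would then be inferred from each replicate. It is not PAUP*; the toy alignment, the taxon labels and the function name `bootstrap_alignment` are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap pseudo-replicate of an alignment.

    Columns (sites) are sampled with replacement; every sequence receives
    the same sampled columns so that homology between rows is preserved.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment (not data from this study).
toy = {
    "taxon_A": "ATGGCT",
    "taxon_B": "ATGGCC",
    "taxon_C": "ATAGCC",
}

rng = random.Random(0)  # fixed seed so that the example is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "replicates of", len(replicates[0]["taxon_A"]), "sites each")
```

The bootstrap support of a clade is then the percentage of replicate trees in which that clade appears.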


The following Bayesian inference analyses were conducted using MrBayes version 3.1.2 (RONQUIST & HUELSENBECK
2003): a) all data (with one model for all three codon positions), b) all data (with codon-specific models) and
c) codon positions one and two only (third codon position excluded; with one model for codon positions one and
two).


For Bayesian inference a Metropolis-coupled Markov chain Monte Carlo analysis with four chains and 1 000 000
generations was performed, with the first 1000 trees discarded as burn-in.


3. RESULTS 


3.1. Statistical tests 


The sequences provided a total of 246 sites, of which 105 were variable and 96 parsimony-informative.
Twenty-four (of 82) first base positions were variable, as were 78 of the third base positions, but only 3 of
the second base positions. Insertion/deletion events (indels) were not observed in any of the groups. The amino
acid alignment had no sites that were parsimony-informative within the Heterobranchia.
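The site counts reported above can be obtained by classifying each alignment column. The sketch below assumes an ungapped alignment that starts at a first codon position (consistent with the absence of indels reported here) and uses the usual definition of a parsimony-informative site: at least two character states, each present in at least two sequences. The toy sequences are hypothetical.

```python
from collections import Counter


def classify_sites(seqs: list[str]) -> dict[int, dict[str, int]]:
    """Count total, variable and parsimony-informative sites per codon position."""
    summary = {pos: {"sites": 0, "variable": 0, "informative": 0} for pos in (1, 2, 3)}
    for i, column in enumerate(zip(*seqs)):
        pos = i % 3 + 1  # assumes the alignment starts at codon position 1
        states = Counter(column)
        summary[pos]["sites"] += 1
        if len(states) > 1:
            summary[pos]["variable"] += 1
        # Informative: at least two states that each occur in two or more sequences.
        if sum(1 for n in states.values() if n >= 2) >= 2:
            summary[pos]["informative"] += 1
    return summary


# Hypothetical toy alignment of two codons (not data from this study).
toy_seqs = ["ATGGCT", "ATGGCC", "ATGGCC", "ATAGCT"]
result = classify_sites(toy_seqs)
print(result[3])
```

In the toy example both third positions are variable, but only the second one is parsimony-informative, because the first has a state found in a single sequence.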


H3 analyses showed a high codon usage bias (Tab. 3). The bias was principally against the use of A and U in
the third codon position. χ² tests were performed for all amino acids and revealed that the null hypothesis
of equal usage of the synonymous codons can be rejected at a significance level of 0.001 for all amino acids
except histidine (p < 0.05) and aspartic acid, for which the null hypothesis cannot be rejected (p = 0.21).
The degree of bias (Σχ²/n) showed a high value of 0.617.
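The degree of bias can be recomputed from the pooled codon counts. The sketch below rests on several assumptions that are not spelled out in the text: the expected count of each codon is the amino acid's total divided by its number of synonymous codons, n is the total number of codons in the alignment (methionine included), and amino acids that do not occur are skipped. The counts are those of Table 3; entries that are illegible in the table were inferred from their RSCU values and should be treated as assumptions.

```python
# Pooled codon counts per amino acid, in the codon order of Table 3.
# Inferred (assumed) entries: Tyr UAU = 2, Pro CCG = 13, Lys AAA = 135.
CODON_COUNTS = {
    "Phe": [6, 142],
    "Leu": [3, 43, 48, 81, 2, 214],
    "Tyr": [2, 96],
    "His": [31, 18],
    "Gln": [41, 252],
    "Ile": [21, 126, 0],
    "Lys": [135, 306],
    "Val": [9, 89, 4, 94],
    "Asp": [38, 60],
    "Glu": [69, 128],
    "Ser": [45, 34, 13, 8, 8, 88],
    "Pro": [83, 107, 42, 13],
    "Arg": [233, 127, 18, 9, 74, 78],
    "Thr": [53, 146, 45, 1],
    "Ala": [201, 279, 47, 12],
    "Gly": [32, 38, 72, 5],
    "Met": [49],
}


def chi_square_equal_usage(counts: list[int]) -> float:
    """Chi-square statistic against equal usage of the synonymous codons."""
    expected = sum(counts) / len(counts)
    return sum((n - expected) ** 2 / expected for n in counts)


def degree_of_bias(table: dict[str, list[int]]) -> float:
    """Sum of per-amino-acid chi-square values divided by the number of codons."""
    n_codons = sum(sum(counts) for counts in table.values())
    chi_sum = sum(chi_square_equal_usage(c) for c in table.values() if sum(c) > 0)
    return chi_sum / n_codons


total_codons = sum(sum(counts) for counts in CODON_COUNTS.values())
bias = degree_of_bias(CODON_COUNTS)
print(total_codons, round(bias, 2))
```

Under these assumptions the counts add up to 4018 codons (49 sequences of 82 codons) and the statistic comes out close to the reported 0.617. The per-amino-acid p-values quoted in the text are not reproduced by this pooled calculation, so they were presumably derived differently.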


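Table 2 (continued). Proportion of invariable sites for each codon-position partition.

Codon position              Proportion of invariable sites
1st position                Pinvar = 0.5692
2nd position                Pinvar = 0.8647
3rd position                Pinvar = equal
1st and 2nd position        Pinvar = 0.6909
1st, 2nd and 3rd position   Pinvar = 0.5408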
Table 3. Codon usage bias. N = number of times a particular codon is observed in the dataset (alignment). RSCU =
number of times a particular codon is observed relative to the number of times that the codon would be observed
in the absence of any codon usage bias. Amino acids (AA) are indicated by their three-letter abbreviations
(Ter = termination codon).

AA   Codon    N   RSCU      AA   Codon    N   RSCU
Phe  UUU      6   (0.08)    Ser  UCU     45   (1.38)
Phe  UUC    142   (1.92)    Ser  UCC     34   (1.04)
Leu  UUA      3   (0.05)    Ser  UCA     13   (0.40)
Leu  UUG     43   (0.66)    Ser  UCG      8   (0.24)
Tyr  UAU      2   (0.04)    Cys  UGU      0   (0.00)
Tyr  UAC     96   (1.96)    Cys  UGC      0   (0.00)
Ter  UAA      0   (0.00)    Trp  UGG      0   (1.00)
Ter  UAG      0   (0.00)    Pro  CCU     83   (1.36)
Ter  UGA      0   (0.00)    Pro  CCC    107   (1.75)
Leu  CUU     48   (0.74)    Pro  CCA     42   (0.69)
Leu  CUC     81   (1.24)    Pro  CCG     13   (0.21)
Leu  CUA      2   (0.03)    Arg  CGU    233   (2.59)
Leu  CUG    214   (3.28)    Arg  CGC    127   (1.41)
His  CAU     31   (1.27)    Arg  CGA     18   (0.20)
His  CAC     18   (0.73)    Arg  CGG      9   (0.10)
Gln  CAA     41   (0.28)    Thr  ACU     53   (0.87)
Gln  CAG    252   (1.72)    Thr  ACC    146   (2.38)
Ile  AUU     21   (0.43)    Thr  ACA     45   (0.73)
Ile  AUC    126   (2.57)    Thr  ACG      1   (0.02)
Ile  AUA      0   (0.00)    Ser  AGU      8   (0.24)
Asn  AAU      0   (0.00)    Ser  AGC     88   (2.69)
Asn  AAC      0   (0.00)    Arg  AGA     74   (0.82)
Lys  AAA    135   (0.61)    Arg  AGG     78   (0.87)
Lys  AAG    306   (1.39)    Ala  GCU    201   (1.49)
Val  GUU      9   (0.18)    Ala  GCC    279   (2.07)
Val  GUC     89   (1.82)    Ala  GCA     47   (0.35)
Val  GUA      4   (0.08)    Ala  GCG     12   (0.09)
Val  GUG     94   (1.92)    Gly  GGU     32   (0.87)
Asp  GAU     38   (0.78)    Gly  GGC     38   (1.03)
Asp  GAC     60   (1.22)    Gly  GGA     72   (1.96)
Glu  GAA     69   (0.70)    Gly  GGG      5   (0.14)
Glu  GAG    128   (1.30)    Met  AUG     49   (1.00)


The consistency index in the maximum parsimony analyses was low (0.210), as was the retention index (0.444).
Using the method developed by XIA et al. (2003), substitution saturation was observed in the third codon
position (Iss = 0.543 > Iss.c = 0.298). To support this observation, the entropy for each position in the
alignment was calculated using BioEdit version 7.0.5.2. Codon position one showed an average entropy value of
0.09, whereas the value was 0.009 for the second codon position and 0.69 for the third codon position. If the
nucleotides occur more or less equally at a certain position within the alignment, the entropy is highest, with
a value of 1.36.
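The per-position entropy can be sketched as follows. This is not BioEdit's code; it assumes that the entropy is the Shannon entropy with natural logarithm, H = -Σ f ln f, computed over the nucleotide frequencies f in each alignment column, and that the alignment starts at a first codon position. The toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column) -> float:
    """Shannon entropy (natural log) of the character frequencies in one column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def mean_entropy_by_codon_position(seqs: list[str]) -> dict[int, float]:
    """Average column entropy for codon positions 1, 2 and 3."""
    sums = {1: 0.0, 2: 0.0, 3: 0.0}
    n_cols = {1: 0, 2: 0, 3: 0}
    for i, column in enumerate(zip(*seqs)):
        pos = i % 3 + 1  # assumes the alignment starts at codon position 1
        sums[pos] += column_entropy(column)
        n_cols[pos] += 1
    return {pos: sums[pos] / n_cols[pos] for pos in sums if n_cols[pos]}


# Hypothetical toy alignment: one invariant and one fully mixed third position.
toy_alignment = ["ATGGCT", "ATGGCC", "ATGGCA", "ATGGCG"]
means = mean_entropy_by_codon_position(toy_alignment)
print({pos: round(value, 3) for pos, value in means.items()})
```

Under the natural-log assumption, a column in which all four nucleotides are equally frequent has entropy ln 4 (about 1.39), and an invariant column has entropy 0.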


3.2. Phylogenetic analyses 


The maximum parsimony 50 % majority-rule consensus tree (Fig. 1) showed all genera (Aplysia, Elysia,
Onchidella, Siphonaria, Orbitestella, Turbonilla and Rissoella) recovered as monophyletic. However, some of the
bootstrap supports were low and there was no bootstrap support for a monophyletic Rissoella. Beyond the genus
level, only the Architectonicoidea were recovered as monophyletic. All other nodes lacked support.


The genera Orbitestella, Turbonilla, Onchidella and Aplysia and, again, the Architectonicoidea were found to
be monophyletic in the maximum likelihood analyses. The base of the 50 % majority-rule bootstrap tree (tree not
shown) resembled a comb. There was no resolution of the deep nodes.


Fig. 1. 50 % majority-rule consensus tree of the 14 most parsimonious trees from the maximum parsimony analysis
of the histone H3 data set (nucleotides); number of parsimony-informative characters = 96, consistency index
(CI) = 0.210.

Fig. 2. 50 % majority-rule consensus Bayesian inference cladogram for the histone H3 data set (based on
nucleotides); Bayesian posterior probabilities are provided at the branches.

Fig. 3. 50 % majority-rule consensus Bayesian inference phylogram for the histone H3 data set (based on
nucleotides, third codon position excluded); Bayesian posterior probabilities are provided at the branches.


The 50 % majority-rule consensus Bayesian inference cladogram (with one model for all three codon positions)
(Fig. 2) also recovered all genera as monophyletic. Only Siphonaria had a low Bayesian posterior probability
(0.58); only values above 0.95 were regarded as statistically significant. Beyond the genus level, the
Architectonicoidea were monophyletic and the Caenogastropoda formed a clade together with Omalogyra. All other
nodes had no statistically significant support.


The 50 % majority-rule consensus Bayesian inference cladogram (with codon-specific models) (tree not shown)
was quite similar to Figure 2. All genera except Rissoella, as well as the Architectonicoidea, were found to be
monophyletic. The remaining nodes were supported by Bayesian posterior probabilities below 0.95.


The 50 % majority-rule consensus Bayesian inference phylogram (with the third codon position excluded and with
one model for codon positions one and two) (Fig. 3) recovered only the genera Rissoella and Turbonilla as
monophyletic, while the Caenogastropoda together with Omalogyra were grouped separately from the rest of the
taxa. All other nodes had no statistically significant support.


4. DISCUSSION 


Molecular investigations of deep-level relationships within the Gastropoda have been hampered by a lack of
slowly evolving genes. Hence, a number of different markers have been utilised to solve this problem. Analyses
of nuclear genes such as the 28S ribosomal RNA and/or the 18S ribosomal RNA have provided a number of important
insights into gastropod relationships at several levels (TILLIER et al. 1994, 1996; DAYRAT et al. 2001;
VONNEMANN et al. 2005). The same applies to mitochondrial genes such as the 16S ribosomal RNA (THOLLESSON et al.
1999) or the cytochrome oxidase subunit I (REMIGIO & HEBERT 2003). COLGAN et al. (2000, 2003) used the histone
H3 gene in combination with other genes to clarify gastropod molecular evolution. However, many aspects of
gastropod phylogeny remain unclear, such as the molecular confirmation of the Heterobranchia concept based on
morphological studies by HASZPRUNAR (1985, 1988). At the moment there is no comprehensive molecular study of
heterobranch phylogeny, especially one including the “basal” taxa (e.g. Pyramidelloidea, Architectonicoidea,
Valvatoidea, Omalogyroidea and Rissoelloidea).


In this study we wanted to present a primary molecular 
insight into heterobranch phylogeny while simultaneous- 
ly testing the utility of the gene coding for the highly con- 
served protein histone H3 for resolving the deeper nodes 
within this taxon. 


Unfortunately, the present study did not provide a robust phylogenetic hypothesis for the relationships among
different lineages of Heterobranchia based on H3 gene sequences. Neither the monophyly of the Euthyneura nor a
step-by-step evolution by the “basal” groups was supported. A conclusion about the monophyly of Opisthobranchia
and Pulmonata could not be extracted from our data because we did not have any resolution at this point.


The first to investigate the value of histone H3 were COLGAN et al. (1998). They wanted to combine small
nuclear ribonucleic acid U2 data and histone H3 to investigate arthropod molecular evolution. However,
partitioned data for H3 and U2 were incongruent according to Incongruence Length Difference tests. Using H3
data only, anomalous nodes appeared in their phylogenies, while some possessed decay indices of 1. Therefore,
their data were not sufficient to clarify relationships within major arthropod groups.


BROWN et al. (1999) investigated the DNA sequence data of 34 Polychaeta species for partial histone H3, U2
snRNA and two segments of 28S rDNA (D1 and D9-10 expansion regions). When using H3 only, BROWN et al. (1999)
found a lack of concordance with morphological results and argued that the inclusion of all H3 data is
inappropriate for the phylogenetic levels under investigation.


COLGAN et al. (2000) and later COLGAN et al. (2003) used partial histone H3 (327 bp) for the investigation of
gastropod phylogeny. In COLGAN et al. (2000), where the authors used 36 sequences of histone H3 only, with the
chiton Ischnochiton australis as an outgroup, no clades were retained in the bootstrap analyses. H3 with the
third codon position excluded retained only the higher Vetigastropoda (bootstrap support = 68 %).


In COLGAN et al. (2003) in which H3 alone was used to 
recover phylogenetic relationships within Gastropoda, on- 
ly the clade of the Patellogastropoda with a support of 52 
% was recovered. When the third codon position was ex- 
cluded, none of the expected groups were recovered. 


OKUSU et al. (2003) were the first to apply DNA sequence data to reconstruct the phylogeny of the molluscan
class Polyplacophora. Their use of 59 sequences of histone H3 resolved deeper nodes than the mitochondrial
genes did, while the strict consensus tree nested the two Gastropoda Viviparus georgianus and Siphonaria
pectinata within Polyplacophora.


A combined approach to the phylogeny of Cephalopoda (Mollusca) using 18S rRNA, 28S rRNA, histone H3 and COI
was the aim of the study presented by LINDGREN et al. (2004). The strict consensus tree for the overall optimal
parameter set for 66 sequences of histone H3 alone did not show monophyly for any of the classes investigated.


KJER et al. (2006) investigated the molecular phylogeny 
of Hexapoda (supermatrix approach with 137 taxa; 
375bp). They only recovered one ordinal level node when 
using only H3. 


According to our phylogenetic results and the results in the papers listed above, histone H3 alone provides no
new marker for studying deep molecular evolution of the Heterobranchia due to the high grade of conservation
and the low phylogenetic signal for deeper nodes.


Several results of our statistical tests support this assumption. We observed a high codon usage bias (see
Tab. 3) in our alignment, which was also indicated by an increased frequency of C- and G-ending codons and
fewer A- and U-ending codons (SHIELDS et al. 1988). High C+G content at silent sites reflects the effect of
selection (SHIELDS et al. 1988), while selective constraints against certain codons might reduce the amount of
phylogenetic noise caused by synonymous substitution at either first or third codon positions (BROWN et al.
1999). However, our data suggest that a bias in codon usage is not necessarily indicative of the phylogenetic
utility of a sequence. Despite a high codon usage bias, our phylogenetic trees showed poor resolution of the
deeper nodes. An explanation for this could be that the pressure to obtain the favoured codon had partially
obscured the phylogenetic signal. COLGAN et al. (1998, 2000) made similar observations and concluded that an
apparently high codon usage bias, as found for the H3 data, does not necessarily result in high phylogenetic
consistency for DNA sequences. In the study presented by BROWN et al. (1999) a lack of agreement of the H3
analyses with morphology occurred despite very high codon usage bias. They concluded that whilst selective
constraints may have reduced the absolute rate of synonymous substitutions, the pressure in favour of
(homoplastic) restitution of the favoured codon has at least partially obscured the phylogenetic signal. Hence,
codon usage bias does not necessarily mean that a gene sequence will be phylogenetically useful.


The degree of bias (Σχ²/n) showed a high value of 0.617. It was similar to the values observed in gastropods
(0.60) (COLGAN et al. 2000) and polychaetes (0.665) (BROWN et al. 1999) and higher than the values for
Drosophila melanogaster (FITCH & STRAUSBAUGH 1993) and arthropods (0.37) (COLGAN et al. 1998).


Another indication suggesting the problems of H3 as a 
marker for studying deep molecular evolution was the high 
grade of conservation indicated by the lack of parsimo- 
ny-informative sites in the amino acid alignment. 


Additional important evidence was the observed substitution saturation at the third codon position. Saturation
is caused by multiple hits, which produce homoplasious changes. Homoplasy resulting from substitution
saturation is one of the major problems in molecular phylogenetics (TILLIER et al. 1996). This problem
generally becomes more relevant at progressively higher taxonomic levels (BOORE & BROWN 1998). If numerous
substitutions occur at the same position, the ancient phylogenetic signal may be hidden or completely erased
(LOPEZ et al. 1999). In order to avoid a decrease of the phylogenetic information contained in the sequences,
we excluded the third codon position in further analyses. However, this was the position with the most variable
sites (78 of 82 positions) in our data set. With the exclusion there was no phylogenetic information left for a
resolution of the deeper nodes. Trees which resemble combs at the base resulted (see Fig. 3). The entropy was
calculated for each position in the alignment to further assess the influence of the three codon positions. A
high entropy value implies that the nucleotides occur almost equally at this position within the alignment. An
almost equal distribution of the nucleotides at one position indicates a weaker selective constraint, allowing
a higher frequency of substitutions. The average values for each of the three codon positions indicate a weaker
selective constraint for the third codon position, hence supporting the previous result that this codon
position is saturated.


It is questionable whether, given a larger data set, the noise will eventually succumb to the signal. There are
few examples where the expansion of an ambiguous data set has resulted in a convincing phylogeny (BOORE & BROWN
1998).


Surprisingly, there was good resolution at the genus level. Analyses conducted with maximum parsimony and
Bayesian inference (with all data) recovered all (or nearly all) genera, mostly with statistically significant
support. However, our findings should be considered preliminary, and further studies including more genera are
necessary to test the possible utility of histone H3 for the resolution of recent splits. In other studies
(COLGAN et al. 1998; OKUSU et al. 2003; LINDGREN et al. 2004) some (but not all) genera were found to be
monophyletic, but because of the questions the authors intended to answer, the taxon sampling of these studies
lacked genera represented by more than one species.


In conclusion, slowly evolving genes suitable for resolving the deeper nodes in gastropod phylogeny are still
missing. To test the Heterobranchia concept as outlined by HASZPRUNAR (1988), other markers have to be found
that offer sufficient variability without substitution saturation.